ABSTRACT



Exploratory and confirmatory factor analytic techniques are 
compared, and how to conduct a confirmatory factor analysis is reviewed. A 
sampling of "fit" statistics and suggestions for methods to improve models 
for testing are also presented. Exploratory factor analysis is used to 
explore data to determine the number or the nature of factors that account 
for the covariation between variables when the researcher does not have, a 
priori, sufficient evidence to form a hypothesis about the number of factors 
underlying the data. Confirmatory factor analysis is a theory-testing model 
as opposed to a theory-generating method like exploratory factor analysis. In 
confirmatory factor analysis, the researcher begins with a hypothesis prior 
to the analysis. This model specifies which variables will be correlated with 
which factors, and which factors are correlated. The process of confirmatory 
factor analysis is described, and it is emphasized that it is important to 
realize that more than one model may accurately describe the data and that a 
number of fit indices should be used to determine the fit of the various 
models. Methods that may increase the fit of the researcher's model to the 
data are described. (Contains 22 references.) (SLD) 
Abstract

This paper presents a brief comparison between exploratory and confirmatory 
factor analytic techniques. The criticisms of exploratory factor analysis follow a 
definition of this method. A definition of confirmatory factor analysis precedes a 
description of the process of conducting a confirmatory factor analysis. A 
sampling of "fit statistics" is provided, as well as suggestions for methods to 
improve models for testing. 
Confirmatory Factor Analysis

Factor analysis includes a variety of correlational analyses designed to 
examine the interrelationships among variables (Carr, 1992; Gorsuch, 1983). 
Summarizing succinctly, Daniel (1988) stated that factor analysis is 
"designed to examine the covariance structure of a set of variables and to 
provide an explanation of the relationships among those variables in terms of a 
smaller number of unobserved latent variables called factors" (p. 2). 

Many definitions are offered in the literature for factor analysis. A 
comprehensive definition was provided by Reyment and Joreskog (1993): 

Factor analysis is a generic term that we use to describe a number of 
methods designed to analyze interrelationships within a set of variables or 
objects [resulting in] the construction of a few hypothetical variables (or 
objects), called factors, that are supposed to contain the essential 
information in a larger set of observed variables or objects ... that reduces 
the overall complexity of the data by taking advantage of inherent 
interdependencies [and so] a small number of factors will usually account 
for approximately the same amount of information as do the much larger 
set of original observations. (p. 71) 

The procedures for factor analysis were first developed early in the 
twentieth century by Spearman (1904). However, due to the complicated and 
time-consuming steps involved in the process, factor analysis was inaccessible 
to many researchers until both computers and user-friendly statistical software 
packages became widely available (Thompson & Dennings, 1993). Regarding 
the utility of factor analysis, Kerlinger (1986) described it as "one of the most 
powerful tools yet devised for the study of complex areas of behavioral scientific 
concern" (p. 689). 
Exploratory Factor Analysis and Confirmatory Factor Analysis
Exploratory Factor Analysis 

Factor analysis takes two major forms: exploratory and confirmatory. The 
choice between them depends on the purpose of the data analysis. 
Exploratory factor analysis is used to 
explore data to determine the number or the nature of factors that account for 
the covariation between variables when the researcher does not have, a priori, 
sufficient evidence to form a hypothesis about the number of factors underlying 
the data. Therefore, exploratory factor analysis is generally thought of as more 
of a theory-generating procedure as opposed to a theory-testing procedure 
(Stevens, 1996). 

Factor analysis is also "intimately involved with questions of validity" 
(Nunnally, 1978, p. 112). In the process of determining whether the identified 
factors are correlated, exploratory factor analysis answers the question asked by 
construct validity: Do the scores on this test measure what the test is supposed 
to be measuring? 

Several shortcomings, discussed below, are associated with exploratory 
factor analysis; yet, when used appropriately, exploratory factor analysis can 
be helpful to researchers in assessing the nature of relationships among 
variables and in establishing the construct validity of test scores. Historically, 
the majority of factor analytic studies have been exploratory (Gorsuch, 1983; 
Kim & Mueller, 1978). Nevertheless, some researchers vehemently praise this 
method while others criticize it with equal force. Nunnally (1978) noted that 
exploratory methods are neither 
"a royal road to truth, as some apparently feel, nor necessarily an adjunct to 
shotgun empiricism, as others claim" (p. 371). 

Criticisms of exploratory factor analysis

Several criticisms have been aimed at exploratory factor analysis. The 
first, according to Mulaik (1987), pertains to the perception that exploratory 
factor analysis may "find optimal knowledge" (p. 265). Mulaik made clear that 
"There is no rationally optimal ways to extract knowledge from experience 
without making certain prior assumptions" (p. 265). 

Also, exploratory assumptions may not always honor the relationships 
among the variables in a given data set. The common factor analysis model is a 
linear model, appropriate for only certain kinds of data. Many causal 
relationships are nonlinear. Superimposing a linear relationship will yield results, 
but these results may be misleading. 

In addition, the factor structures yielded by an exploratory factor analysis 
are determined by the mechanics of the method and are dependent on specific 
theories and mechanics of extraction and rotation procedures. This, too, can 
produce inaccurate results. Mulaik (1987) made clear that exploratory 
techniques do not provide any way of indicating when something is wrong with 
one's assumptions, because the technique was designed to fit the data 
regardless. Exploratory factor analysis therefore suggests hypotheses but does 
not justify the "knowledge" it produces. 

Another problem with exploratory methods lies in the interpretation of the 
results. The interpretation of factors measured by a few variables is frequently 
complicated (Nunnally, 1978). Mulaik (1972) suggested that the difficulty in 
interpretation often comes about because the researcher lacks prior knowledge 
and therefore has no basis on which to make an interpretation. 

Yet another problem frequently associated with exploratory factor analysis 
is that it does not yield generally optimal solutions for the factors or unique 
interpretations for them, which makes it difficult to justify results. In 
summarizing the utility of exploratory factor analysis, Mulaik (1972) 
stated: 

In a practical sense, there is no question that exploratory factor analysis 
serves a useful purpose in suggesting hypotheses for further research. 

But one must not be misled into thinking that exploratory factor analysis- 
or any exploratory statistical technique, for that matter-is the only way, or 
even the optimal way, available to us to obtain suggestions for 
hypotheses. One's own direct experience with a phenomenon often 
suffices to suggest hypotheses. (p. 269) 

Confirmatory Factor Analysis 

Confirmatory factor analysis is a theory-testing model as opposed to a 
theory-generating method like exploratory factor analysis. In confirmatory factor 
analysis, the researcher begins with a hypothesis prior to the analysis. This 
model, or hypothesis, specifies which variables will be correlated with which 
factors and which factors are correlated. The hypothesis is based on a strong 
theoretical and/or empirical foundation (Stevens, 1996). 

In addition, confirmatory factor analysis offers the researcher a more 
viable method for evaluating construct validity. The researcher is able to 
explicitly test hypotheses concerning the factor structure of the data due to 
having the predetermined model specifying the number and composition of the 
factors. 

Confirmatory methods, after specifying the a priori factors, seek to 
optimally match the observed and theoretical factor structures for a given data 
set in order to determine the "goodness of fit" of the predetermined factor model. 
Commenting on the utility of confirmatory factor analysis, Gorsuch (1983) stated: 
"Confirmatory factor analysis is powerful because it provides explicit hypothesis 
testing for factor analytic problems.... Confirmatory factor analysis is the more 
theoretically important - and should be the much more widely used - of the two 
major factor analytic approaches" (p. 134). He specified that exploratory methods 
should be "reserved only for those areas that are truly exploratory, that is, areas 
where no prior analyses have been conducted" (p. 134). 

Confirmatory Factor Analysis Procedure 

The first step in a confirmatory factor analysis requires beginning with 
either a correlation matrix or a variance/covariance matrix or some similar matrix. 
The researcher proposes competing models, based on theory or existing data, 
that are hypothesized to fit the data. The models specify, for example, the 
degree of correlation, if any, between each pair of common factors; the degree 
of correlation between individual variables and one or more factors; and which 
particular pairs of unique factors are correlated. 

The different models are determined by "fixing" or "freeing" specific 
parameters such as the factor coefficients, the factor correlation coefficients, and 
the variance/covariance of the error of measurement. These parameters are set 
according to the theoretical expectation of the researcher. Gillaspy (1996) 
provided definitions for fixing and freeing parameters: 

Fixing a parameter refers to setting the parameter at a specific value 
based on one's expectations. Thus, in fixing a parameter the researcher 
does not allow that parameter to be estimated in the analysis.... Freeing a 
parameter refers to allowing the parameter to be estimated during the 
analysis by fitting the model to the data according to some theory about 
the data. The competing models or hypotheses about the structure of the 
data are then tested against one another. (p. 7) 
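
The distinction between fixed and free parameters can be made concrete with a small sketch. The two-factor, six-variable model below is hypothetical and is not taken from the source; the names `FREE`, `count_free`, and `degrees_of_freedom` are illustrative assumptions. A free parameter is marked `None`, a fixed parameter carries its fixed value, and the degrees of freedom are counted in the usual way as the number of distinct variances and covariances minus the number of free parameters.

```python
# Hypothetical two-factor model for six observed variables (illustration only).
# None marks a free parameter (estimated); a number marks a fixed parameter.
FREE = None

loadings = [
    # factor 1, factor 2
    [FREE, 0.0],
    [FREE, 0.0],
    [FREE, 0.0],
    [0.0, FREE],
    [0.0, FREE],
    [0.0, FREE],
]
# Factor variances are fixed at 1.0 to set the scale; the factor correlation is free.
factor_cov = [
    [1.0, FREE],
    [FREE, 1.0],
]
# Unique (error) variances are free; unique factors are fixed as uncorrelated,
# so only the diagonal is listed.
unique_var = [FREE] * 6


def count_free(model_loadings, model_factor_cov, model_unique_var):
    """Count the parameters that would be estimated rather than fixed."""
    n = sum(cell is FREE for row in model_loadings for cell in row)
    # The factor covariance matrix is symmetric: count the lower triangle only.
    n += sum(
        model_factor_cov[i][j] is FREE
        for i in range(len(model_factor_cov))
        for j in range(i + 1)
    )
    n += sum(v is FREE for v in model_unique_var)
    return n


def degrees_of_freedom(n_variables, n_free):
    """Distinct variances and covariances minus the number of free parameters."""
    return n_variables * (n_variables + 1) // 2 - n_free


n_free = count_free(loadings, factor_cov, unique_var)
df = degrees_of_freedom(len(loadings), n_free)
print(n_free, df)
```

A competing model is obtained by changing a single entry. Fixing the factor correlation at 0.0, for example, specifies uncorrelated factors and leaves one more degree of freedom.
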
The actual confirmatory factor analysis can be conducted using one of 
several computer programs such as LISREL VII (Joreskog & Sorbom, 1989). 
The competing models are then tested against one another via the computer 
program. The completed analysis yields several different statistics for 
determining how well the competing models fit the data, or explain the 
covariation among the variables. These statistics are referred to as "fit 
statistics." The fit statistics test all of the parameters simultaneously (Stevens, 
1996). These fit statistics are evaluated to determine which predetermined 
model(s) best explain the relationships between the observed and latent 
variables. This process was described by Bentler (1980): 

The primary statistical problem is one of optimally estimating the 
parameters of the model and determining the goodness-of-fit of the model 
to sample data on measured variables. If the model does not fit the data, 
the proposed model is rejected as a possible candidate for the causal 
structure underlying the observed variables. If the model cannot be 
rejected statistically, it is a plausible representation of the causal 
structure. (p. 420) 
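
The estimation problem Bentler describes can be written compactly. The notation below is an assumption for illustration and does not appear in the source; it follows the convention commonly used with LISREL-type programs. With p observed variables measured on N cases, a matrix of factor coefficients \Lambda, a factor covariance matrix \Phi, a covariance matrix of unique factors \Theta, and t free parameters collected in \theta, the model-implied covariance matrix is compared with the sample covariance matrix S through the maximum likelihood discrepancy function:

```latex
\begin{align}
\Sigma(\theta) &= \Lambda \Phi \Lambda^{\top} + \Theta \\
F_{ML}(\theta) &= \ln\lvert \Sigma(\theta) \rvert - \ln\lvert S \rvert
                  + \operatorname{tr}\!\left( S\, \Sigma(\theta)^{-1} \right) - p \\
\chi^{2} &= (N - 1)\, F_{ML}(\hat{\theta}), \qquad
df = \frac{p(p+1)}{2} - t
\end{align}
```

Here \hat{\theta} denotes the parameter values that minimize F_{ML}. The discrepancy is zero when the model reproduces S exactly and grows as the two matrices diverge.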

Fit Statistics 

As stated previously, the fit statistics test how well the competing models 
fit the data. Mulaik (1987) put it more precisely: "a goodness-of-fit test 
evaluates the model in terms of the fixed parameters used to specify the model, 
and acceptance or rejection of the model in terms of the overidentifying 
conditions in the model" (p. 275). Examples of these statistics include the chi 
square/degrees of freedom ratio, the Bentler comparative fit index (CFI) (Bentler, 
1990), the parsimony ratio, and the goodness-of-fit index (GFI) (Joreskog & 
Sorbom, 1989). 
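
The comparative fit index is named above but not defined in this paper. As a minimal sketch, assuming the commonly cited form of the index, it compares the chi square of the tested model with that of a null (independence) model in which no variables are correlated. The function name and the numeric values below are hypothetical and serve only as illustration.

```python
def comparative_fit_index(chi2_model, df_model, chi2_null, df_null):
    """CFI in its commonly cited form: 1 - max(chi2_m - df_m, 0) divided by
    max(chi2_m - df_m, chi2_null - df_null, 0)."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, 0.0)
    denominator = max(d_model, d_null)
    if denominator == 0.0:
        # Neither model shows misfit beyond its degrees of freedom.
        return 1.0
    return 1.0 - d_model / denominator


# Hypothetical values: tested model chi square 24 on 8 df,
# null model chi square 415 on 15 df.
cfi = comparative_fit_index(24.0, 8, 415.0, 15)
print(round(cfi, 3))
```

Values closer to 1.00 indicate that the tested model removes most of the misfit present in the null model.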

Chi square/degrees of freedom ratio 

The chi square tests the hypothesis that the model is consistent with the 
pattern of covariation among the observed variables. In the case of the chi- 
square statistic, smaller rather than larger values indicate a good fit. The chi- 
square statistic is very sensitive to sample size, rendering it unclear in many 
situations whether the statistical significance of the chi square statistic is due to 
poor fit of the model or to the size of the sample. This uncertainty has led to the 
development of many other statistics to assess overall model fit (Stevens, 1996). 

Another way to describe the chi square goodness of fit statistic is to say 
that it tests the null hypothesis that there is no difference between the 
observed and theoretical covariance structure matrices. The chi-square 
statistic has been referred to as a "lack of fit index" (Mulaik, James, Van Alstine, 
Bennett, Lind & Stilwell, 1989) because a statistically significant result yields a 
rejection of the fit of a given model. 
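
The sensitivity to sample size can be illustrated with a short sketch. It assumes the common formulation in which the chi square equals (N - 1) times the minimized value of the fitting function; the discrepancy value, the degrees of freedom, and the function names are hypothetical. The amount of misfit is held constant while only the number of cases changes.

```python
CRITICAL_CHI2_DF8 = 15.507  # tabled .05 critical value of chi square for 8 df


def chi_square_from_discrepancy(f_min, n_cases):
    """Chi square as (N - 1) times the minimized discrepancy value."""
    return (n_cases - 1) * f_min


def chi_square_df_ratio(chi2, df):
    """Chi square divided by its degrees of freedom."""
    return chi2 / df


F_MIN = 0.05  # hypothetical minimized discrepancy, identical in every run
DF = 8        # hypothetical degrees of freedom of the tested model

results = []
for n_cases in (101, 401, 1601):
    chi2 = chi_square_from_discrepancy(F_MIN, n_cases)
    ratio = chi_square_df_ratio(chi2, DF)
    results.append((n_cases, chi2, ratio, chi2 > CRITICAL_CHI2_DF8))

for n_cases, chi2, ratio, rejected in results:
    print(f"N={n_cases:5d}  chi2={chi2:6.2f}  chi2/df={ratio:5.2f}  rejected={rejected}")
```

With the same degree of misfit, the model is retained at the smallest sample size and rejected at the two larger ones. The statistic alone therefore cannot separate poor fit from large N.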

Goodness-of-fit index (GFI) and adjusted goodness-of-fit index (AGFI) 

The goodness-of-fit index "is a measure of the relative amount of variances 
and covariances jointly accounted for by the model" (Joreskog & Sorbom, 1986, 
p. I.41). This index can be thought of as being roughly analogous to the multiple 
R squared in multiple regression. A model is considered to have a better fit when 
"it has a lower ratio computed as the noncentrality parameter divided by degrees 
of freedom" (Thomas & Thompson, 1994, p. 10). The closer the GFI is to 1.00, 
the better is the fit of the model to the data. 

The adjusted goodness of fit statistic is based on a correction for the 
number of degrees of freedom in a less restricted model obtained by freeing 
more parameters. Both the GFI and the AGFI are less sensitive to sample size 
than the chi square statistic. 
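
For reference, the two indices are commonly written as follows under maximum likelihood estimation. The notation is an assumption for illustration and is not taken from the source: S is the sample covariance matrix, \hat{\Sigma} the covariance matrix reproduced by the fitted model, I the identity matrix, p the number of observed variables, and df the degrees of freedom of the model.

```latex
\begin{align}
\mathrm{GFI} &= 1 - \frac{\operatorname{tr}\!\left[ \left( \hat{\Sigma}^{-1} S - I \right)^{2} \right]}
                        {\operatorname{tr}\!\left[ \left( \hat{\Sigma}^{-1} S \right)^{2} \right]} \\
\mathrm{AGFI} &= 1 - \frac{p(p+1)}{2\, df} \left( 1 - \mathrm{GFI} \right)
\end{align}
```

As a hypothetical example, a model with p = 6, df = 8, and GFI = .95 would have AGFI = 1 - (42/16)(.05), or approximately .87. The adjustment penalizes models that reach a given GFI by freeing many parameters.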

Parsimony ratio 

One of the goals of science is parsimony, because as William of Occam 
argued, parsimonious solutions are more likely to be true and are therefore 
typically more generalizable. The parsimony ratio is therefore important when 
interpreting the data. This statistic takes into consideration the number of 
parameters estimated in the model. The fewer parameters necessary to specify 
the model, the more parsimonious the model is. Multiplying the parsimony ratio 
by a fit statistic yields an index of both the overall efficacy of the model in 
explaining the covariance among the variables and the parsimony of the 
proposed model (Gillaspy, 1996). 
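
A minimal sketch of this calculation follows. It assumes the common definition of the parsimony ratio as the degrees of freedom of the tested model divided by the degrees of freedom of the null (independence) model; the function names and the fit values for the two competing models are hypothetical.

```python
def null_model_df(n_variables):
    """Degrees of freedom of the independence model: only the variances are
    estimated, so every covariance contributes one degree of freedom."""
    return n_variables * (n_variables - 1) // 2


def parsimony_ratio(df_model, df_null):
    """Share of the available degrees of freedom that the model leaves unused."""
    return df_model / df_null


def parsimonious_fit(fit_index, df_model, df_null):
    """Fit index weighted by the parsimony ratio."""
    return parsimony_ratio(df_model, df_null) * fit_index


DF_NULL = null_model_df(6)  # six observed variables

# Two hypothetical competing models for the same six variables.
simple_model = parsimonious_fit(0.96, 8, DF_NULL)   # fewer free parameters
complex_model = parsimonious_fit(0.98, 4, DF_NULL)  # more free parameters
print(round(simple_model, 3), round(complex_model, 3))
```

In this illustration the more complex model has the slightly higher unweighted fit, but the simpler model has the higher weighted value because it reaches nearly the same fit with fewer estimated parameters.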

Interpreting Confirmatory Factor Analyses 

It is important to remember when interpreting the findings from a 
confirmatory factor analysis that more than one model can be determined that 
will adequately fit the data (Biddle & Marlin, 1987; Thompson & Borrello, 1989). 
Therefore, finding a model with good fit does not mean that the model is the 
only, or the optimal, model for those data. In addition, because there are a number of 
fit indices with which to make comparisons, "fit should be simultaneously 
evaluated from the perspective of multiple fit statistics" (Campbell, Gillaspy, & 
Thompson, 1995, p. 6). 

When the theoretical structure fails to fit the observed factor structure, 
the researcher can evaluate ways to improve the model by exploring which 
fixed parameters might be freed and which free parameters might be fixed. 
The computer packages can be utilized to 
change parameters one at a time in order to determine what changes offer the 
greatest amount of improvement in the fit of the model. 
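
The one-at-a-time search described above can be sketched as follows. The chi-square values, the parameter labels, and the function name are hypothetical; in practice the values would come from refitting the model in the analysis program after freeing each parameter in turn. Each single freed parameter uses one degree of freedom, so the drop in chi square is compared with the tabled .05 critical value for 1 df.

```python
CRITICAL_CHI2_1DF = 3.841  # tabled .05 critical value of chi square for 1 df


def best_single_change(baseline_chi2, refit_chi2):
    """Return the candidate change with the largest drop in chi square.

    refit_chi2 maps a label for each freed parameter to the chi square
    obtained when only that parameter is freed.
    """
    improvements = {
        label: baseline_chi2 - chi2 for label, chi2 in refit_chi2.items()
    }
    best = max(improvements, key=improvements.get)
    drop = improvements[best]
    return best, drop, drop > CRITICAL_CHI2_1DF


# Hypothetical refits of a model whose baseline chi square is 42.0.
candidates = {
    "variable 3 on factor 2": 30.5,
    "variable 5 on factor 1": 40.9,
    "unique factors 1 and 2 correlated": 37.2,
}
label, drop, significant = best_single_change(42.0, candidates)
print(label, round(drop, 1), significant)
```

A change selected in this way is driven by the data rather than by theory, so it is best treated as a hypothesis for a later confirmatory test rather than as a confirmed result.
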
Summary

The present paper illustrated the difference between exploratory and 
confirmatory factor analyses and outlined the shortcomings of exploratory 
methods. Confirmatory factor analysis has an advantage over exploratory factor 
analysis in that it allows the researcher to test competing hypotheses regarding 
the factors underlying the data. The process of conducting a confirmatory factor 
analysis was described. It was emphasized that more than one model may 
accurately describe the data and that a number of fit indices should be used to 
determine the fit of the various models. Finally, methods available to increase 
the fit of the researcher's model to the data were explained. 